INTRODUCTION TO ROOT CAUSE ANALYSIS
OBJECTIVES 

A. TERMINAL OBJECTIVE 

Improve usefulness of BWR Owner's Group Root Cause Coding database by
increasing personnel knowledge of Root Cause Analysis methods, goals,
and benefits.

B. ENABLING OBJECTIVES 

1. Identify the characteristics of a Root Cause. 

2. Compare the outcome of investigations done with and without 
Root Cause Analysis techniques. 

3. Compare application of each of the following techniques to 
Root Cause Analysis: 

a) Kepner Tregoe 

b) MORT 

c) Events and Causal Factors 

d) HPES 

4. Evaluate Root Cause Analysis benefits 

a) Benefits of using Root Cause Analysis 

b) Detriments of not using Root Cause Analysis 

5. Determine the strengths and weaknesses of each Root Cause 
Analysis technique.

a) Kepner Tregoe 

b) MORT 

c) Events and Causal Factors 

d) HPES 

6. Use the BWROG Scram Root Cause Coding Flow Chart as an
application of RCA in categorizing Root Cause and/or finding
Root Cause.

7. Evaluate the outcome of a Root Cause Analysis process for
completeness, accuracy, and consistency with common-sense
expectations.

NOTE 

The following objective will be accomplished by the site 
training department using the site's preferred root cause 
analysis methodology. The scenario with some basic root
causes is provided, and the method can be exercised under
the site training department's direction.

8. Demonstrate several RCA techniques on sample events.

a) Use the Basic Root Cause methodology.

b) Follow use of technique on sample event.

Submitted by 

Approved by 



Reviewed by

Background

The Boiling Water Reactor Owner's Group (BWROG) chartered its Scram
Frequency Reduction program to conduct Operations Activity programs, such
as information exchange meetings, engineering design studies for plant
modifications, and plant maintenance practices, to reduce the frequency of
reactor scrams to the NUMARC goal of 3 scrams per year, decreasing to 2
scrams per year by 1990. The Operations Activity group commissioned this
introductory training program in Root Cause Analysis.

Root Cause Analysis, as applied to Scram Frequency Reduction, is a powerful
tool that involves plant personnel in improving plant operations in a
directly measurable, high dollar value area: unplanned reactor scrams.
Incremental removal of root causes by corrective action cumulatively 
improves overall plant operations. 

Questions you'll be able to answer after this course 

1. What is Root Cause Analysis? What is Root Cause?

2. What are the reasons for doing Root Cause Analysis? What are the
reasons for doing Root Cause Coding flowcharts?

3. What techniques are used for Root Cause Analysis? Is any technique
robust enough to satisfy the goals?

4. What mistakes have been made in Root Cause Analysis/Coding in the past?
Where are the opportunities to leverage the results in the future?

5. Why is your station management concerned enough with Root Cause Analysis
to train more personnel in this technique?

This Root Cause Analysis introductory training was developed by the General 
Electric Company, Nuclear Training Services. 
I. INTRODUCTION TO ROOT CAUSE ANALYSIS

Section I training objectives:

1. Identify the characteristics of a Root Cause. 

2. Compare the outcome of investigations done with and without Root 
Cause Analysis techniques. 

4. Evaluate Root Cause Analysis benefits 

a) Benefits of using Root Cause Analysis 

b) Detriments of not using Root Cause Analysis 

What is Root Cause Analysis? Quite simply, analysis of data after an event
to determine the Root Cause(s). And what is the "Root Cause?" According to
M. Paradies and D. Busch of Savannah River Plant, Root Causes are "the most
basic causes that can be reasonably identified and over which the . . .
management team has control to fix." D. Gano, in "Root Cause and How to
Find it", says Root Causes are "the most basic reasons for an effect, which
if corrected will prevent recurrence." Outside the nuclear industry, Root
Cause is "the most basic cause of an event/problem that, when corrective
action is taken, prevents recurrence, or minimizes the effect of recurrence
of, the event/problem."

These definitions include the success criteria for corrective reaction to
root causes. Without success criteria, we can define root causes as "the
most basic causes for an effect that can be reasonably identified."

Using that definition, let's look at two examples. 
A Non-Power Industry example

A large bank's staff felt that a significant number of customers waited
more than two rings for the phone to be answered. Survey results showed
that callers neglected for 5 or more rings became irritated, and would not
call the bank again. Callers answered in two rings or less were reassured
and comfortable with doing business by phone.

Rather than recommend more operators, the staff performed root cause 
analysis to determine why the phone rang more than two times before being 
answered. They found that Customer B waited more than two rings while: 

Operator 1 routed Customer A's call;

--because Operator 2 was on break;
--because the person Customer A was calling (receiver)
was unavailable and Operator 1 was unaware;
--because Customer A's receiver was helping another
Customer, and no substitute is available;
--because Customer A could not identify correct
receiver.

Operators 1 and 2 routed Customer calls;

--because receiver was unavailable and Operator was
unaware;
--because receiver was helping another Customer, and no
equivalent is available;
--because Customer could not identify correct
receiver.

The analysis also shows possibilities such as Operator inadequately trained,
too few Operators, etc. The staff recorded how often each cause resulted
in a caller kept waiting. The checksheets showed that one operator out of
office was the most frequent cause, followed by receiving party not
present, no substitute available, customer unaware of section and name of
receiver, and several causes grouped under other due to their relative
infrequency.
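
The checksheet tally described above amounts to a frequency count ranked from most to least frequent. A minimal sketch in Python follows. The counts are hypothetical placeholders chosen only to reproduce the ranking stated in the text; the source gives the order of the causes, not the numbers.

```python
from collections import Counter

# Hypothetical checksheet: one entry per caller kept waiting.
# The counts are illustrative assumptions, not data from the bank's study.
checksheet = (
    ["one operator out of office"] * 5
    + ["receiving party not present"] * 4
    + ["no substitute available"] * 3
    + ["customer unaware of section and name of receiver"] * 2
    + ["other"] * 1
)


def pareto_ranking(entries):
    """Return (cause, count, cumulative_percent) rows, most frequent first."""
    counts = Counter(entries)
    total = sum(counts.values())
    rows = []
    running = 0
    for cause, count in counts.most_common():
        running += count
        rows.append((cause, count, round(100.0 * running / total, 1)))
    return rows


ranking = pareto_ranking(checksheet)
for cause, count, cumulative in ranking:
    print(f"{count:3d}  {cumulative:5.1f}%  {cause}")
```

Ranking by frequency shows where corrective reactions are likely to pay off first, but as the next paragraph shows, the top entry may still need further questioning before it can be called a root cause.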
One operator out of office by itself was incomplete, as the frequency study
and further investigation showed the frequency of calls increasing at the
same time that one operator was out of the office: specifically, lunch
time! Many customers call the bank during their lunch hour, and the
receiver was most frequently unavailable when the phone traffic was
heaviest, further complicating the problem.

Figure One shows one version of a cause and effect diagram for this 
problem. The root causes and groupings are: 

Operator

--Does not understand message;
--Does not know receiver's job responsibilities;

Receiver

--Not at desk;
--Out of office;
--Absent;
--Busy with another customer;

Customer

--Does not know receiver name or section;
--Lengthy discussion with operator;
--Complaining;
--Starts to leave a message;

Operator working system

--Lunch time rest;
--Telephone call rush;
--Absent.

Other.
A Power Plant example

After a reactor scram, management asks a team to investigate the cause.
The event summary: a technician assigned to test instrument channel "A"
mistakenly connects the test equipment to channel "B" and causes the scram.
Obviously a personnel error, but we have to do the paperwork.

Further investigation by the team reveals several interesting facts: 

* the technician usually works in the other reactor plant at the 
same site; 

* the channel designations are not labelled on the test panel; 

* the channel test connections are mirror-image from the other
reactor plant panel;

* the test panel does not have "bypassed channel" indication; 

* the technician is qualified to do work; 

* the designated channel was bypassed correctly; 

* the technician was using the procedure step-by-step. 

Figure 2 shows the simplified cause and effect chart.

The "obvious personnel error" is contestable, if not totally incorrect, 
considering these facts. 

Are these valid root causes? If compared to the initial results (not 
enough operators, obvious personnel error), the causes are definitely more 
basic. Are they the result of reasonable investigative effort and 
expenditure? Without more detailed information to the contrary, we would 
have to say yes. 
Corrective Reactions

Did Root Cause Analysis Pay Off?



Root cause analysis, as used in the nuclear industry, gives both root
cause(s) and suggestions for corrective reactions. Let's check the
examples for content validity by comparing the corrective reactions against
our definitions.

The corrective reactions for the non-power industry example may include: 

* stagger the operator lunch time rest outside peak call times;

* hire temporary help for peak call times; 

* ensure receivers notify operators of unavailability (and give 
them the tools to do so, like an attendance/location board); 

* cross-train receivers in each other's expertise, so more than 
one resource is available to customers; 

* hook up a dedicated line with a prerecorded message giving
general information.

The corrective reactions for the power industry example may include: 

* only allow technicians to work on one unit; 

* label the channel designations on the test panel; 

* move the test connections on one unit, removing the
mirror-image problem;

* install channel bypassed indication on test panel.

Adding the recurrence prevention criterion, both analyses yield suggested
reactions that will minimize or preclude recurrence. The management team
has power to fix the causes in both examples, satisfying that criterion.
Corrective Reactions without Root Cause Analysis:
What's wrong with this picture? 

The proof of valid root cause analysis may be to contrast these causes and
their corrective reactions with the original "causes" and reactions. In
the first example, hiring more operators is kinder than most reactions in
real life. The existing operators could have been given time off without
pay, replaced completely, or subjected to training in how to do their job
better. In the second example, the real-world results may include all of
those reactions, plus unnecessary procedure revisions, administrative
awareness programs to improve operator attentiveness, and, of course,
several recommendations from operations consultants/advisors.

Dean Gano from Washington Public Power Supply System quotes the criteria 
for root cause solutions as: 

1. A solution that prevents recurrence;

2. A solution that is within our control;

3. A solution that allows us (the power station) to meet our
other objectives, such as to produce power efficiently.

The reactions in the examples when true root cause analysis is not used do 
not satisfy these criteria. When root cause is not found, not only do 
events recur, but the process becomes less efficient with each repetition. 
Perhaps more importantly, incorrect reactions may not only allow recurrence 
of the same event, they may in fact allow other new events. 

An operator/technician is punished by time off without pay, insulted 
by unnecessary training, or blamed in some other way for a mistake 
made after being "set up" by procedures, design, or policy. He will 
understandably hesitate to cooperate with the process in the future. 
When a procedure is blamed for problems originating in inadequate
component design, the procedure changes merely allow someone to 
overcome poor design. When design changes are made to systems that 
merely need clearer procedures, unnecessary funds are spent, and one 
design change may yield more than one effect when subjected to the 
complex interactions of a nuclear power plant. It may be the root 
cause for another event. 
It becomes clear, as root cause analysis is studied in more depth, that the
existing "place the blame" methods, and the data from those methods, are
neither accurate nor helpful. How do we know that the data is not helpful?
The scrams from those "root causes" continue to occur. In the 1983
Significant Event Report root cause analysis by INPO, "Human performance
problems (44 percent of the total) is the dominant category of the root
cause." Based on those facts, there should have been a massive drive to
further automate plants, moving a dominant "root cause" further away from
the plant. There was no such movement, perhaps because training
improvements were a more palatable alternative, perhaps because there never
was any faith in those "root causes". There are also practical limits on
the amount of resources, including time, manpower, and money, to make the
hardware and design changes suggested. Mark Paradies and David Busch, from
the E.I. du Pont de Nemours Co. Savannah River Plant, said, "We didn't want
to blame every incident on God (the ultimate root cause) or to blame every
incident on the operator (a handy root cause because they were there when
the (incident) took place). In fact, we didn't want to blame anyone."

Does this mean that human performance "miscues" never cause reactor scrams
(or other undesirable effects)? No, but there are often other contributory,
if not controlling, factors.

An event can be coded to personnel error, using the root cause coding
descriptions from the BWR Owner's Group Scram Frequency Reduction
Committee, when a well-trained, properly-directed person, working in an
environment conducive to the task, following an accurate, effective
procedure on correctly designed equipment (etc., etc.), makes an error
because: (1) lack of concentration/attention to the task being performed
led to the error; (2) the procedure for the task was not followed; or
(3) an attitude problem resulted in the employee not concentrating on the
task he should have been doing.
Current evaluations show, when accurate root cause analysis is used, that
personnel error due to one of these three problems is only about 10% of the 
total causes of events. The emphasis on blaming personnel is a result of 
not looking deep enough for root cause, not knowing what other things 
contribute to personnel error, and lack of accurate information from 
personnel who are taught to conceal facts by the reaction to openness and 
candor. It is obvious that placing blame on personnel incorrectly leads to 
a weaker RCA process. Personnel learn. 
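
The personnel-error coding rule described above can be read as a two-part test: every supporting condition must have been adequate, and one of the three listed reasons must apply. The sketch below is one possible reading in Python. The class and field names are hypothetical, the "(etc., etc.)" conditions are not modeled, and the values assigned to the technician example (direction, environment, and procedure accuracy) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class EventFacts:
    # Conditions that must all hold before personnel error applies
    # (hypothetical field names; the source's "etc., etc." is not modeled).
    well_trained: bool
    properly_directed: bool
    environment_conducive: bool
    procedure_accurate: bool
    equipment_correctly_designed: bool
    # The three qualifying reasons for the error.
    lack_of_attention: bool = False
    procedure_not_followed: bool = False
    attitude_problem: bool = False


def codes_to_personnel_error(facts: EventFacts) -> bool:
    """True only if all conditions were adequate AND one listed reason applies."""
    conditions_adequate = all([
        facts.well_trained,
        facts.properly_directed,
        facts.environment_conducive,
        facts.procedure_accurate,
        facts.equipment_correctly_designed,
    ])
    reason_applies = any([
        facts.lack_of_attention,
        facts.procedure_not_followed,
        facts.attitude_problem,
    ])
    return conditions_adequate and reason_applies


# The power plant example: a qualified technician following the procedure,
# but an unlabelled, mirror-image panel (equipment design inadequate).
technician_event = EventFacts(
    well_trained=True,
    properly_directed=True,        # assumption
    environment_conducive=True,    # assumption
    procedure_accurate=True,       # assumption
    equipment_correctly_designed=False,
)
print(codes_to_personnel_error(technician_event))
```

Under these assumptions the example does not code to personnel error, which matches the text's conclusion that the "obvious personnel error" is contestable.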
Why do Root Cause Analysis?

Given that the current process is not as effective as possible, let's 
examine why the process is used. This may allow us to determine the 
characteristics of a process that will work. 

Why analyze for the most basic cause? In the aviation industry the answer
is a little more dramatic, if not more obvious: to prevent "events" from
recurring. As most of us travel by air periodically, this seems to be a
goal with merit.

In the nuclear industry, however, there are several objectives. A primary
industry goal is to gather information available after events so it can be
used by other applicable units to prevent the same or similar events. This 
is the basis for the BWROG SFRC root cause coding system. The primary goal 
for the event plant is to prevent recurrence at the same facility. The 
goal at forward-looking utilities may be to prevent similar events from 
recurring by simultaneously omitting similar root causes. 

Are these really the goals of Root Cause Analysis? An analogy can be drawn 
with nuclear training's recent history. Despite attention being focussed 
on parts of training, such as task analysis and accreditation, the real 
goal was improved on the job performance. The real reason for Root Cause 
Analysis is not just reducing the number of scrams or unplanned shutdowns, 
or increasing the availability, but improving performance of the plant. It 
is vitally important to keep this in mind when examining the techniques for 
Root Cause Analysis. When selecting an RCA technique, a monitoring 
parameter should be the potential for improving plant performance. In 
other words, the process is the key to improved plant performance, not the 
result (reducing the number of scrams). We discuss ways to monitor
effectiveness of Root Cause Analysis in Section IV. 







The answer to "Why do root cause analysis?" is to improve plant performance 
by corrective reactions based on accurate root causes. By the way, the 
process can be applied to any event, or to determine why something works so 
well. We are unnecessarily limiting root cause analysis' potential if it 
is just applied to reducing the number of scrams, but that will be 
discussed in more detail in the summary section. 

Mr. Mark Paradies made an interesting comment during a BWROG SFRC
presentation. He said, "We don't think the big problems, the Three Mile
Islands, are caused by a single problem usually. We've beat that down in
the nuclear industry. We don't have single failure points.... but we do
have some multiple failure points, and the only way to address those is to
get rid of as many as possible and to learn as much as you can from this
operating experience." This seems to summarize the objectives of Root
Cause Analysis very well.
How should I do Root Cause Analysis,
& How do I know I'm there? 

The Root Cause Analysis (RCA) technique that works best is determined by 
each user, but there are proven methods for the process, regardless of the 
technique chosen. 

In the interest of learning from others, let's examine root cause analysis
as performed by Électricité de France (EdF).

The French utility, having generic reactor designs (thirty-two 3-loop
900 MWe units and fourteen 4-loop 1300 MWe units), benefits more from
sharing information about common components. The payoff from the
investigative efforts is shared by all common units, and the risk to
generic units from common failure is much more severe. Therefore, the
support for operating experience utilization is systematic and company-wide.

There are four segments to the organization EdF uses for the "utilization
and processing of operating experience based on a group-oriented approach."
The four segments are:

* a structure for gathering information resulting from the failures or
incidents;

* a structure for analyzing these incidents for failures and for
deciding the corrective action to take;

* a system for storing and retrieving the information;

* a system for circulating the information and the corresponding
decisions and corrective (re)actions (emphasis added).

When does it start? Usually (unfortunately) after an event. Section IV 
will explain the "unfortunately." 
Who does it? In most cases, the investigation is done by a team, usually
independent from the involved personnel. One utility uses a team that 
reports to the plant manager, with "no particular allegiance to any 
department," to gain the most objective viewpoint of the incident. 
Although they make a list of recognized experts that are available when 
needed, that plant recommends the following team makeup: 

* a Human Performance Evaluator; this is a person who is 
specifically trained in evaluating all aspects of human 
performance and documenting the results and recommendations, 
usually in accordance with the INPO Human Performance Evaluation 
system or an equivalent. 

* System/Component Experts; 

* Operations Experts; 

* Discipline Experts (stress analysts, chemists, etc.). 

The Savannah River Plant (SRP) uses an independent group from the Corporate 
organization to investigate the root cause. They are trained in how to 
conduct the personnel interviews, with special attention paid to not 
placing blame on any person during the investigation. 

It is worth noting that the BWROG SFRC feels participation in the root 
cause analysis effort should increase. The more teams of people that input 
information into the analysis, the better the result, as a general rule. 

It is also worth noting a difference in the two examples presented earlier. 
In the bank case study, the staff did the study themselves. In the nuclear 
industry example, a team was tasked with the investigation by management. 
The message in that distinction is subtle but clear: principles of total
employee involvement result in voluntary, effective, valid improvements.
How is the Root Cause Analysis done? The investigative team uses one of the
techniques described in Section II to ask questions about the event.
Several experts tell us that the analysis must start with the Primary
Effect; typically, the reactor scram.

The most common technique used in initial RCA is Events and Causal Factors,
where the Scram is the initial event, and the major causal factors are
known or easily attainable. The Savannah River Plant uses the Events and
Causal Factors method (developed by EG&G for the Department of Energy) to
develop a timeline to ensure complete analysis of all events. A tip from
Washington Public Power Supply System (WPPSS) is to try to develop two or
more causes (or causal factors) for each event, then determine two or more
causes for each cause, and so on. These methods help ensure that the
effort to pursue root cause is not abandoned too early.

In his book, KAIZEN - The Key to Japan's Competitive Success, Masaaki Imai
states that "problem solvers are told to ask 'why' not once but five times.
Often the first answer to the problem is not the root cause. Asking why
several times will dig out several causes, one of which is usually the root
cause." Don't stop asking questions too quickly!
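
Repeated "why" questions, combined with the WPPSS tip of seeking two or more causes at each step, naturally produce a tree of causes. The following Python sketch is an illustration only: the `Cause` class and both helper functions are hypothetical names, and the tree arranges the facts of the power plant example in one plausible way (the original Figure 2 is not reproduced here).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cause:
    description: str
    causes: List["Cause"] = field(default_factory=list)  # answers to "why?"


def most_basic_causes(node: Cause) -> List[str]:
    """Leaves of the tree: causes for which no further 'why' has been answered."""
    if not node.causes:
        return [node.description]
    found = []
    for child in node.causes:
        found.extend(most_basic_causes(child))
    return found


def thin_branches(node: Cause) -> List[str]:
    """Nodes with exactly one listed cause; candidates for asking 'why' again."""
    flagged = [node.description] if len(node.causes) == 1 else []
    for child in node.causes:
        flagged.extend(thin_branches(child))
    return flagged


# One plausible arrangement of the power plant example (an assumption).
scram = Cause("Reactor scram", [
    Cause("Test equipment connected to channel B instead of channel A", [
        Cause("Channel designations not labelled on test panel"),
        Cause("Test connections are mirror-image of the other unit's panel"),
        Cause("No bypassed-channel indication on test panel"),
        Cause("Technician usually works in the other reactor plant"),
    ]),
])

print(most_basic_causes(scram))
print(thin_branches(scram))
```

A node flagged by `thin_branches` is not necessarily wrong; it is simply a place where the team may have stopped asking questions too early.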

Another caution from the experts is to keep an open mind. Pre-judging the 
root cause(s) dooms the team into conspiring for recurrence. The most 
basic causes must be found to answer each question. 
There is not one root cause per event, according to several sources. J.L.
Burton reported in Power Engineering that River Bend Nuclear Station found
an average of 9.5 root causes per scram. This should lead the
investigators to keep probing.

WPPSS recommends tying the event and causal factor to a recommendation for
a solution. They compare the combined cause, effect, and solution to the
root cause criteria discussed earlier.
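
Comparing a proposed solution against the three root cause solution criteria quoted earlier from Dean Gano can be sketched as a simple checklist. In the Python sketch below the class name, field names, and the true/false judgments for each example are illustrative assumptions; in practice those judgments are the investigating team's call.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProposedSolution:
    description: str
    prevents_recurrence: bool
    within_our_control: bool
    meets_other_objectives: bool


def unmet_criteria(solution: ProposedSolution) -> List[str]:
    """Return the criteria the proposed solution fails, in the order listed."""
    checks = {
        "prevents recurrence": solution.prevents_recurrence,
        "within our control": solution.within_our_control,
        "allows us to meet our other objectives": solution.meets_other_objectives,
    }
    return [name for name, satisfied in checks.items() if not satisfied]


# Illustrative judgments only (assumptions, not findings from the source).
label_panel = ProposedSolution(
    "label the channel designations on the test panel", True, True, True)
time_off = ProposedSolution(
    "give the technician time off without pay", False, True, False)

for candidate in (label_panel, time_off):
    print(candidate.description, "->", unmet_criteria(candidate) or "meets all criteria")
```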

The French utility, Électricité de France, ensures that its investigative
teams find the same basic information whether the event initiating the
investigation is "important-to-safety" or "with a bearing on safety and/or
having an important financial impact". That basic information includes:

* description of the circumstances;

* damage observed;

* assessment of the consequences on the equipment and the plant;

* assessment of the probable root cause and sequence of events;

* immediate corrective action taken;

* suggestions for a definite solution (preventive maintenance,
modification) or an acceptability of event recurrence.

When the root causes are found, the goal of the BWROG SFRC can be
addressed. The root causes are coded, which allows the "operational
experience gained" to be shared with others. The only condition on this
wealth of experience is that it be understandable. The standardization of
root cause coding, such as that proposed by the BWROG SFRC, will allow
industry-wide sharing of experience. Every plant participating can
accumulate hundreds of reactor years of operating experience annually.
SRP uses a peer review group to ensure the RCA and coding are complete and
accurate. The primary coding engineer presents the events and causal
factors charting, the root cause analysis, the root cause coding, and the
investigation history, and the peer review group does a QA check.
Sometimes additional investigation is required, and sometimes the process 
is just exchange of information. The group must reach consensus on the 
root cause before the causes are recorded. 

How do we know when we are done? To assist the investigator, SRP assigns 
names to three levels of cause. There are six basic root cause categories, 
with near root cause at the next lower level, and finally the root cause. 
By their method, the investigator knows by the coding when root cause is 
reached. SRP also acknowledges that root cause is not always identified. 

Most RCA methods don't stop at finding the root causes. The investigation 
team is not disbanded until the event reaction recommendations or solutions 
are made. The investigative team recommendations are given to the 
management team, which prioritizes the recurrence prevention reactions. 
The investigation team should track the effectiveness of the reactions, as 
they are intimately familiar with the causes and the recommendations. 

One of the critical points in the root cause, corrective reaction 
determination, and correction implementation process is confirming the 
results. Finding root causes and matching corrective reactions is futile 
if the event recurs, or if the root causes precipitate another event. The 
highly respected Dr. Deming, a Quality Control specialist, fit the process 
into a loop, termed the PDCA cycle (Plan, Do, Check, Action). The 
corrective reaction is the Plan, the implementation is the Do, the Check is 
confirming the results, and Action means "preventing recurrence and 
institutionalizing the improvement as a new practice to improve upon." At 
this point, the practice enters the loop again. We will discuss the 
Deming cycle more in Section IV. 
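
The Plan, Do, Check, Action loop described above can be sketched as a short routine that keeps cycling until a corrective reaction is confirmed effective. This Python sketch is a simplification under stated assumptions: the function name, the callables, and the recurrence counts are all hypothetical, and real confirmation takes operating time rather than a table lookup.

```python
def run_pdca(candidate_reactions, implement, confirm):
    """Cycle through corrective reactions until one is confirmed effective.

    Returns (adopted_reaction_or_None, log of each cycle).
    """
    log = []
    for reaction in candidate_reactions:       # Plan: the corrective reaction
        outcome = implement(reaction)          # Do: the implementation
        effective = confirm(outcome)           # Check: confirm the results
        log.append((reaction, outcome, effective))
        if effective:
            return reaction, log               # Action: adopt as the new practice
    return None, log


# Hypothetical number of recurrences observed after each reaction.
recurrences_after = {
    "retrain the technician": 2,
    "label channels on the test panel": 0,
}

adopted, log = run_pdca(
    list(recurrences_after),
    implement=lambda reaction: recurrences_after[reaction],
    confirm=lambda recurrences: recurrences == 0,
)
print(adopted)
```

The adopted practice would then re-enter the loop as the baseline for the next improvement, which this single-pass sketch does not model.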

Now that we know the methods and goals for Root Cause Analysis, let's 
discuss the techniques commonly used for Root Cause Analysis in more depth. 
II. ROOT CAUSE ANALYSIS TECHNIQUES

Section II training objectives:



3. Recognize the potential for applying one or more of the following 
techniques when performing Root Cause Analysis: 

a) Kepner Tregoe 

b) Management Oversight and Risk Tree (MORT) system

c) Events and Causal Factors charting

d) Human Performance Evaluation System (HPES)

e) Cause-and-Effect technique

f) Root Cause Coding Flow Chart method 

5. Describe the limitations of RCA techniques with regard to finding 
practical solutions. 

Root Cause Analysis is nothing more than a series of questions: What
happened? When did it happen? Who caused it to happen? And, most
importantly, why did it happen? Unfortunately, in some root cause
analyses it appears that the only question asked is: who caused it to
happen? Several techniques or tools may be used to find the answers to
these questions. Systematic techniques seek to answer these questions
consistently so the root causes of an event can be found.



All analysis techniques have only one goal: to determine the root causes of
an event. Once these causes have been determined, corrective reactions are
developed to prevent their recurrence. It is not significant which
technique or tool is used to perform Root Cause Analysis. If performed
properly, all methods should point to the same causes for any given event.

These methods or techniques, as well as those used at a specific plant, are 
systematic approaches for investigating an event or incident. As such, 
these methods are guidelines to determine the questions to ask, and 
identify when a root cause is achieved. 

The root cause analysis tools commonly used are: 

a) Kepner Tregoe 

b) Management Oversight and Risk Tree (MORT) system 

c) Events and Causal Factors charting 

d) Human Performance Evaluation System (HPES) 

e) Cause-and-Effect technique 

f) Root Cause Coding Flow Chart method

This list is not all-inclusive; other tools and/or methods can be developed
to perform the same functions. As stated earlier, it doesn't matter which
tool is used as long as the root causes of an event can be determined. In
most cases, the methods that have been developed are derived from portions
of the Kepner Tregoe, MORT, Events and Causal Factors, or Cause-and-Effect
techniques, or some combination of these.
KEPNER TREGOE TECHNIQUE

The Kepner Tregoe (KT) method is a systematic, logical method of resolving
concerns. This method, or a portion of this process, is used by almost
everyone, even those who have never received formal training in it. The KT
method labels, and arranges in a logical sequence, the normal thought
processes commonly used when making a decision or solving a problem.

The KT method is divided into four smaller processes. One process is
called Situation Appraisal. This process sorts out complex or ill-defined
situations. With the situation properly sorted, and its associated
concerns prioritized, you can determine which of the other three processes
to enter: Decision Analysis, Potential Problem Analysis, or Problem
Analysis.

Decision Analysis, as its name implies, is used when a decision must be 
made. This process shifts the focus from alternatives to the objectives 
which must be met by a decision. By carefully defining the objectives, a 
more carefully reasoned decision based on information and analysis can be
made.

Potential Problem Analysis helps anticipate the difficulties that may arise 
when any decision or action plan is implemented. This process also helps 
determine if plans need to be developed which will protect the original 
decision or action plan, if the foreseen difficulties do occur. 

As can be seen in the above paragraphs, the decision analysis and potential
problem analysis processes relate to a root cause analysis when causes have
been determined and a corrective reaction plan has been developed to
prevent recurrence. The situation appraisal process can identify the
events which require root cause analysis.






The fourth process of the KT method most closely relates to the actual
performance of a root cause analysis. This process is called Problem
Analysis (sometimes referred to as Change Analysis). Problem analysis is a
systematic process for finding the cause of a deviation and is made up of
three basic steps. The deviation, with regard to root cause analysis, is
the event or incident which is to be analyzed.



The steps which make up the problem analysis process of the KT method are: 



1. Describe the Problem

The problem is described by clearly stating the deviation, or
stating what should have occurred and what actually occurred.

As an aid in clearly stating the deviation, information should be 
gathered to answer the following questions: 

a. What is the deviation(s)? 

b. Where is the deviation(s)? 

c. When did the deviation(s) occur? 

d. To what extent did the deviation(s) occur? 

With this information in place, the next step of clearly
understanding the deviation is to develop an IS and IS NOT
comparison chart. This chart should contain what, where, when,
and to what extent the deviation(s) IS along with what, where,
when, and to what extent the deviation(s) IS NOT.

2. List the Possible Causes

This second basic step of the Problem Analysis process develops a 
list of possible causes for the specified deviation. This list is 
generated by listing the distinctions and/or changes that have 
occurred between the items of the IS and IS NOT lists. The causes 
of the distinctions or changes are then investigated. 



3. Find the True Cause(s)



The last basic step of the Problem Analysis process is finding the 
true cause of the deviation. This step tests the list of possible 
causes for the most probable causes. This is done by comparing all 
of the possible causes with the observed specifics (the IS/IS NOT
chart) of the deviation. If the cause could produce all of the 
same observed specifics, it can be classified as a probable cause. 



When all the probable causes have been determined, then the True 
Cause must be found and verified. This is done by further 
investigation, experimentation, observation, etc. of the most 
probable causes. 
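
The testing step described above can be sketched in code. This is a minimal illustration only, not part of the KT materials: the low-flow deviation, the chart entries, the candidate causes, and the helper names `is_probable` and `probable_causes` are all invented for the example. It also assumes one reading of the test, namely that a probable cause must be able to produce every IS entry and none of the IS NOT entries.

```python
# Hypothetical IS / IS NOT chart for an invented low-flow deviation.
IS_IS_NOT = {
    "what":   {"is": "pump A low flow",   "is_not": "pump B low flow"},
    "where":  {"is": "train A",           "is_not": "train B"},
    "when":   {"is": "after maintenance", "is_not": "before maintenance"},
    "extent": {"is": "one pump",          "is_not": "all pumps"},
}

# Assumed data: the observations each possible cause could produce.
POSSIBLE_CAUSES = {
    "valve left throttled during maintenance": {
        "pump A low flow", "train A", "after maintenance", "one pump",
    },
    "low level in common suction source": {
        "pump A low flow", "pump B low flow", "train A", "train B",
        "after maintenance", "before maintenance", "all pumps",
    },
}


def is_probable(effects, chart):
    """A cause is probable if it explains every IS entry and no IS NOT entry."""
    explains_is = all(row["is"] in effects for row in chart.values())
    predicts_is_not = any(row["is_not"] in effects for row in chart.values())
    return explains_is and not predicts_is_not


def probable_causes(causes, chart):
    """Return the names of the possible causes that survive the test."""
    return [name for name, effects in causes.items() if is_probable(effects, chart)]


print(probable_causes(POSSIBLE_CAUSES, IS_IS_NOT))
```

The surviving causes would still need to be verified by further investigation, experimentation, or observation before any one of them is called the True Cause.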



As shown, the KT technique for performing a root cause analysis does
provide the basic benefits of a good analysis tool. This technique
gives an investigator a structured guideline for determining the information
needed, the questions to ask, and when to stop; i.e., when the root causes
have been reached.



The major drawback to this technique when performing root cause analysis or
determining corrective reactions is that, as in any "thought" process,
extensive training in the technique is required and constant practice in
its use is necessary. Also, a significant amount of time, energy, and
resources may be required in the verification of the true causes of the
event.



This method, however, does provide a good base for the development of a 
more specific analysis tool to find the root causes of reactor plant 
events.

CAUSE-AND-EFFECT TECHNIQUE

This technique for determining root cause depends on only two items: the
definition of a root cause, and the question "Why did this effect or event
occur?". As such, this tool is very easy to use and is limited only by the
knowledge and experience levels of the user.

The definition of a root cause is fundamentally important to the use of 
this technique. The definition used determines the criteria to be met by 
any root cause developed by this technique. For example, the root cause is
defined as: The most basic reasons for an event, which if corrected will
prevent recurrence. This definition tells us that a root cause must be
correctable (if it isn't, it may be considered a cause but not a root
cause), and that the correction must also prevent event recurrence. It is
implied that the correction must be within our control, and allow us to
meet our other goals or objectives. Other criteria could also be derived
from this definition, depending on where within the structure of the
analysis it is used.

Using the CAUSE-AND-EFFECT technique is simply starting with the most 
significant event and determining the cause(s) of it. The cause(s) for 
this event's cause(s) are then determined, and this chain of events and 
causes is continued until no other causes can be determined. These causes 
are then verified by determining if the root cause criteria have been met. 
For example: 

The most significant event is a reactor trip from the reactor 
protection system (RPS). Therefore, why did the reactor trip from RPS? 
Answer: due to actuation of the RPS low water level switches. Why did 
the RPS low water level switches actuate? Answer: due to low reactor
water level. Why was there a low reactor water level? And so on.
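
The repeated "why" questioning in this example can be sketched as a simple chain walk. This is an illustrative sketch only: the `WHY` table holds just the two answers given in the example, and the function name `why_chain` is an assumption made for the illustration.

```python
# Each effect maps to its immediate cause, following the reactor trip example.
# The chain stops where the example stops; no further cause is assumed here.
WHY = {
    "reactor trip from RPS": "actuation of the RPS low water level switches",
    "actuation of the RPS low water level switches": "low reactor water level",
}


def why_chain(event, why):
    """Ask "why?" repeatedly until no further cause has been determined."""
    chain = [event]
    seen = {event}
    while chain[-1] in why:
        cause = why[chain[-1]]
        if cause in seen:  # guard against circular reasoning
            break
        chain.append(cause)
        seen.add(cause)
    return chain


for depth, item in enumerate(why_chain("reactor trip from RPS", WHY)):
    print("  " * depth + item)
```

The last item in the chain is only a candidate. It still has to be verified against the root cause criteria before it is accepted.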



Causes are not always as straightforward as those in this example. In most
cases, the causes found for each event depend on the investigator's 
experience and knowledge levels. Therefore, when using this method for 
determining root cause, it is strongly suggested that an expert team 
perform the analysis. This broadens the experience and knowledge used in 
conducting the investigation and determination. 

As the causes are being determined for each event, it is also suggested
that a corrective reaction or solution be prescribed for each cause. This
gives the investigative team a benchmark for determining when the root
cause has been reached. When a reasonable solution, which can be 
controlled or implemented by management, is reached then the associated 
cause may be called a valid root cause. 

The primary drawback to this technique is the implied suggestion that only
one solution can correct a root cause. It also lays a significant burden
on the investigative team, in that a "reasonable" solution determined by
them may not be an "acceptable" solution for management to implement.
Extreme care also needs to be taken to prevent "short cuts" or predetermined
assumptions when performing this technique. As can
be seen in the example, each event must be listed as a single item and only
provable facts or qualified judgements may be used for the associated causes.

An addendum to this technique strongly suggests that the investigative team 
provide at least two causes for each event/effect. This requirement 
ensures that all possible causes are considered for a single event and no 
"root causes" are overlooked. 
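
The root cause criteria and the two-causes guideline can be expressed as simple checks. The following is a sketch under stated assumptions: the `Cause` record, the helper names, and the sample data are hypothetical, and the three yes/no fields are one possible way to record the criteria derived from the root cause definition.

```python
from dataclasses import dataclass


@dataclass
class Cause:
    description: str
    correctable: bool          # can the cause be corrected at all?
    prevents_recurrence: bool  # would the correction prevent recurrence?
    within_control: bool       # can management control or implement it?


def is_root_cause(cause):
    """A cause qualifies as a root cause only if every criterion is met."""
    return cause.correctable and cause.prevents_recurrence and cause.within_control


def effects_needing_more_causes(causes_by_effect, minimum=2):
    """List the effects for which fewer than `minimum` causes were proposed."""
    return [effect for effect, causes in causes_by_effect.items()
            if len(causes) < minimum]


# Hypothetical analysis data, invented for illustration only.
CAUSES_BY_EFFECT = {
    "low reactor water level": [
        Cause("feedwater pump tripped", True, False, True),
        Cause("level control procedure step omitted", True, True, True),
    ],
    "feedwater pump tripped": [
        Cause("severe storm on the grid", False, False, False),
    ],
}

root_causes = [c.description
               for causes in CAUSES_BY_EFFECT.values()
               for c in causes if is_root_cause(c)]
print(root_causes)
print(effects_needing_more_causes(CAUSES_BY_EFFECT))
```

In this invented data set, one effect has only a single proposed cause, so the team would be prompted to look for at least one more before moving on.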

EVENTS AND CAUSAL FACTORS CHARTING

This technique was originally developed by the National Transportation
Safety Board (NTSB) as an analytical tool for accident investigation. This
method creates a chart (or diagram) which depicts, in a logical sequence,
the events and their causal factors that led to an accident occurrence.
With very little or no modification this method may be used for root cause 
analysis in a nuclear power plant. 

This method can be used by itself as a mechanism for performing a root 
cause analysis, but is often used in conjunction with one or more of the 
cause coding tree methods discussed later in this section. In application, 
the events and causal factors flow charting begins in much the same manner
as the cause-and-effect technique discussed earlier. 

Starting with the most significant event, e.g., a reactor trip, a sequence
of factual or observed events is built which leads to the most significant
event. These events should be written to meet the following suggested
criteria:

1. Each event should describe an occurrence or happening and not a
condition, state, circumstance, issue, conclusion, or a result;
e.g., "the pipe ruptured", not "the pipe had a crack in it".

2. Each event should be described by a short sentence with one
subject and one action verb; e.g., "mechanic checked valve
fastener tightness", not "mechanic checked valve fastener
tightness and opened valve".

3. Each event should be precisely described; e.g., "operator placed
pump switch to START", not "operator started the pump".

4. Each event should be quantified when possible; e.g., "reactor
water level decreased by 36 inches", not "reactor water level
decreased".




5. Each event should be derived directly from the event(s) and
conditions preceding it; e.g., "operator placed pump switch to
START" which then goes to "operator verified normal pump discharge
pressure reading of 800 psig," then "operator placed discharge
valve switch to OPEN" which goes to "discharge piping pressure
increased to 800 psig" which leads to "the pipe ruptured" that
goes to "reactor water level decreased by 36 inches", etc. Each
event is derived logically from the one preceding it; if this is
not the case, it usually indicates that one or more steps of the
sequence have been left out.

These single events are then investigated to determine their cause, or 
"contributing factors". When all the events and their contributing factors 
leading to the incident have been determined, they are placed into their 
proper sequence to form a time line of the accident or incident. 

This time line is then charted using the following suggested format: 

1. Events should be enclosed in rectangles and connected together, in
sequence, with solid arrows.

2. The sequence of events should be depicted in a straight horizontal
line with the events arranged chronologically from left to right.

3. If there are any secondary events, or event sequences, these
should also be connected to the primary sequence of events in
their chronological order.

4. Contributing conditions and factors should be enclosed in ovals
and connected with one another or with their associated events by
dashed lines or arrows.

The events and their contributing factors should track in a logical
progression from beginning to end of the incident/accident sequence.



The contributing factors are then evaluated, one at a time, to determine 
which factors, if prevented, would have prevented or significantly 
mitigated the incident. These contributing factors are called "causal
factors". The causal factors are annotated on the chart by small 
triangles. If contributing factors (or causal factors) of an event are 
developed further, they will lead to the root causes of the incident. 
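
The charting conventions and the causal factor annotation can be sketched as a small text rendering. This is an illustration under stated assumptions: the `Event` and `Condition` records and the helper names are invented, events are shown in square brackets in place of rectangles, conditions in parentheses in place of ovals, and a trailing `^` stands in for the small triangle. The event texts come from the examples given earlier; treating the cracked pipe as a causal factor is an assumption made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Condition:
    text: str
    causal: bool = False  # True if preventing it would have prevented or mitigated the incident


@dataclass
class Event:
    order: int            # chronological position in the sequence
    text: str
    conditions: list = field(default_factory=list)


def render(events):
    """Return the chart as text lines, with events in chronological order."""
    lines = []
    for event in sorted(events, key=lambda e: e.order):
        lines.append(f"[{event.text}]")
        for cond in event.conditions:
            marker = " ^" if cond.causal else ""
            lines.append(f"  - - ({cond.text}){marker}")
    return lines


def causal_factors(events):
    """Collect the contributing conditions annotated as causal factors."""
    return [c.text
            for e in sorted(events, key=lambda e: e.order)
            for c in e.conditions if c.causal]


# Events are entered out of order on purpose; the chart sorts them.
EVENTS = [
    Event(3, "the pipe ruptured",
          [Condition("the pipe had a crack in it", causal=True)]),
    Event(1, "operator placed pump switch to START"),
    Event(2, "discharge piping pressure increased to 800 psig",
          [Condition("pressure was within the normal range")]),
    Event(4, "reactor water level decreased by 36 inches"),
]

print("\n".join(render(EVENTS)))
print(causal_factors(EVENTS))
```

Keeping the chart as data makes it easy to update as additional facts and conditions are discovered during the investigation.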

The main drawback to this system in determining root causes and the
associated corrective reactions is that no specific guidance is given on
how much further to develop the causal factors found above.

As stated earlier, this method is usually used in conjunction with one of 
the cause coding tree techniques. In this respect, the causal factors are 
used as the starting point for coding. Each causal factor is coded to the 
lowest possible level for which there are answers in the appropriate tree. 
At that point the root causes have been determined. 

Using Events and Causal Factors charting has three benefits. It meets the 
objectives of incident investigation by determining what happened and why 
it happened, to prevent the same or similar occurrences in the future. It 
helps conduct the investigation by showing the need for in-depth analysis, 
illustrating multiple causes and the chronology of events, and visually 
portraying the interactions and relationships of all involved individuals 
and organizations. It aids in writing the investigation report by checking
investigative logic completeness, identifying matters requiring further 
investigation, and differentiating between the analysis of the facts and 
the resultant conclusions. 

These benefits result when the following seven key elements are met in
applying this technique:

1. Begin the technique as soon as factual information about the
incident starts to accumulate.

2. Use the suggested guidelines as a method for getting started and 
for staying on track with the investigation. 

3. Proceed logically using all available data. 

4. Use an easily updated chart, as additional facts and conditions 
are continually discovered during the investigation. 

5. Validate the results of this method with other investigative
tools.

6. Select the appropriate level of detail to investigate, if not 
already suggested by the investigation appointment authority. 

7. Condense the Events and Causal Factors chart into a short 
executive summary chart whenever it is necessary to refer to a 
concise and easy-to-follow version of the incident sequence. 

MANAGEMENT OVERSIGHT AND RISK TREE (MORT) SYSTEM



The management oversight and risk tree (MORT) system is an event-oriented
working tool which can perform two functions. The MORT system can
determine the root causes of an accident or incident that has occurred, and
it can evaluate an existing safety program to determine the likelihood that a
significant accident is about to occur.

In order to perform these functions the MORT system incorporates four basic 
key features: 

1. An analytical "logic tree" or diagram which arranges safety 
program elements in an orderly, coherent, and logical manner. 

2. A schematic representation of an "ideal" safety system model by 
using Fault Tree Analysis methodology. 

3. A methodology for analyzing a specific safety program. 

4. A collection of philosophical statements and general advice 
relative to the application of the MORT system safety concepts and 
a list of criteria which can be used to measure the effectiveness
of their application. 

As we are only concerned with Root Cause Analysis, only the use of the MORT 
system as it pertains to analyzing accidents and incidents will be 
addressed. It should be noted however, that this system is also an 
effective management tool in evaluating and developing specific safety 
programs within the industry. 

As stated earlier, the MORT system supplies a "logic" tree which allows for
determining, using a visual display, the root cause(s) of an accident or
incident. This tree uses some standard "logic" symbols within its body to
control the investigator's path as he works through the tree to determine
the cause(s) of the accident. These symbols are:

1. Rectangle

This symbol encloses an event: either the first event or those
events resulting from the combination of more basic events acting
through logic gates.

2. AND Gate Symbol

Use of this symbol indicates that all of the input values (normally
found at the bottom of the symbol) must be present in order to lead
to the output (condition or event).

3. OR Gate Symbol

Use of this symbol indicates that only one of the input values
(normally found at the bottom of the symbol) needs to be present in
order to lead to the output (condition or event).

4. Oval

This symbol encloses a condition or constraint that is connected to
either an event block or to one of the gate symbols. When connected
to a gate symbol, the stated condition or constraint will specify
how and when the gate will function.

5. Risk Symbol

Indicates that the investigator should transfer to the "Assumed
Risk" branch of the tree. It is used for problems with no known or
practical countermeasures.

6. Triangle

This symbol indicates a connection or transfer from one branch of
the tree to another. The "transfer out" symbol (triangle with a
line connected to one of its legs) normally contains a number or
code which transfers the investigator to another branch via a
"transfer in" symbol (triangle with a line connected to one of its
points) containing the same number or code.

7. Circle

This symbol encloses an event described by a basic component or
part failure. This event is independent of other events within the
tree.

8. Diamond

This symbol encloses an event that has not been developed to its
cause. The sequence is usually terminated for a lack of information
or lack of consequences from the event.

9. Scroll

This symbol encloses an event that is normally expected to occur.

10. Stretched Circle

Encloses an event that is satisfactory. This is normally used to
show the completion of a logical analysis.

The MORT diagram is entered at its TREE TOP with the event box, marked with
a T, which specifies the losses that occurred. Since the diagram can also
be used as an evaluation tool, a second event box, connected by dashed
lines, for future undesired events may be used to enter the tree.

Following entry into the tree, the first decision point is reached (as
indicated by an OR logic gate). Was the loss that was incurred the result
of an Oversight and/or Omission or was it an assumed risk? All events are
considered to be an Oversight and/or Omission unless the investigator has
been specifically informed by upper management that it was an assumed risk.
Following the oversight and omission event box in the logic tree, the
investigator is directed into two branches by an AND logic gate. One
branch specifically addresses the management factors associated with the
accident or incident. The other branch addresses the specific control
factors, human and mechanical, that were involved.

Following the specific controls (S) branch leads the investigator to the
event box labeled "accident". From the accident event box the main body of
the MORT chart is entered. The investigator works downward, through the 
tree, following each connecting branch until the questions posed by the 
circled statements of the chart are answered either "yes" or "no". At this 
point the analysis ends. 
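
The AND and OR gates at the top of the tree can be sketched as a small recursive evaluation. This is a simplified illustration, not the MORT chart itself: only the top-level structure described in the text is modeled, the dictionary layout and the function name `evaluate` are assumptions, and the `facts` mapping stands in for the investigator's "yes" or "no" answers.

```python
def evaluate(node, facts):
    """Return True if the event or condition described by node is present."""
    if "gate" not in node:  # basic input: look up the investigator's finding
        return facts.get(node["name"], False)
    inputs = [evaluate(child, facts) for child in node["inputs"]]
    if node["gate"] == "AND":  # all inputs must be present
        return all(inputs)
    if node["gate"] == "OR":   # one input is enough
        return any(inputs)
    raise ValueError(f"unknown gate: {node['gate']}")


# Top of the tree as described in the text; lower branches are omitted.
TREE = {
    "name": "losses", "gate": "OR",
    "inputs": [
        {"name": "oversight and/or omission", "gate": "AND",
         "inputs": [{"name": "management factors"},
                    {"name": "specific control factors"}]},
        {"name": "assumed risk"},
    ],
}

findings = {"management factors": True, "specific control factors": True}
print(evaluate(TREE, findings))
```

A full MORT tree has many more branches and transfer symbols. The sketch shows only how the gate logic steers the path through the tree.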

As can be seen from the description above, the MORT system provides all of
the benefits of using a logical, visual, and analytical method for
performing root cause analysis. The primary drawback of this system, and of
developing the resulting corrective reactions, is the time it takes to
learn and to use the system. The MORT tree is extremely large, due to the
number of items it encompasses, and difficult to use for specific accident
or incident evaluations.

The advantages of using the MORT tree, however, are apparent. This system 
ensures that each cause is considered and provides a good visual display of
the path of the investigation. Because of these significant advantages the 
MORT system has been used as a model for other "tree" type root cause 
charting techniques. 

HUMAN PERFORMANCE EVALUATION SYSTEM (HPES)

Following the accident at Three Mile Island Unit 2, industry-wide screening
and analysis of plant events intensified. The results of these analyses
revealed the frequent presence of human error. Human error, due to today's
complex technology and organizational structures, can be caused by many 
external factors. The objective of the human performance evaluation system 
(HPES) is to improve overall plant operations by reducing human error 
through correcting the conditions which cause these errors. 

This technique for determining root causes evaluates the human performance
during an accident or incident in a reactor plant. The technique uses
three basic analyses to determine the root causes of an event. These
analyses are performed in conjunction with filling out established forms
which direct the investigator to the appropriate information required to
complete these analyses. These analyses are:

1. Situational Analysis 

This analysis determines when, where, and what event happened, as 
well as the job category, experience level, work schedule, and the 
general task(s) that plant individuals were working when the event 
occurred. 

2. Causal Factor Analysis

This analysis determines the causal factors, with regard to human
factors, that affected the event. The causal factors are
found by grading the appropriate elements on charts which cover
the following categories:

a. Communications (both written and verbal)

b. Interface design or equipment condition

c. Environmental conditions

d. Work schedule and practices

e. Work organization and/or planning

f. Supervisory methods

g. Training and/or qualification methods and content

h. Change management

i. Resource management

j. Managerial methods


3. Behavioral Factor Analysis

This analysis tells how the event happened by grading a series of
causes within the following categories: 

a. The type of inappropriate action that occurred. 

b. The behavioral function in which the inappropriate action 
occurred. 

c. The internal factors affecting the ability to order/direct, 
sense, interpret, or to act. 

d. The external factors affecting the ability to order/direct, 
sense, interpret, or to act. 

e. The behavioral shaping factors (causal categories) that led 
to the inappropriate actions. 



This program augments and supports line management's function of managing 
human performance and carries the bonus of strengthening the plant team 
relationship. The nonpunitive reporting climate, fostered by this program, 
leads to more error reporting and frequently to the correction of 
underlying causes prior to an actual event. The utility also benefits
through an increase in employee job satisfaction, resulting from fewer task 
errors and from employee participation in solving identified problems. 

The major drawback to this system, and its resulting corrective reactions, 
is that the system primarily deals with root causes in a human factors 
methodology. Other causes could be easily overlooked if they do not fit 
with the "human performance" framework. 

ROOT CAUSE CODING FLOW CHART TECHNIQUE

This method combines two previous techniques, the events and causal factors
flowcharting and an abbreviated MORT chart. The method was devised by the
BWR Owner's Group for consistency in reporting and storing root causes of
reactor events. Consistency in charting among the Owner's Group utilities
will allow for easier understanding of the root causes and their corrective
reactions by other members of the Group.

This technique starts with determining the causal factors of an event using
the events and causal factors charting method, discussed earlier. When the
event's causal factors have been determined, each factor is coded through
the BWR Owner's Group cause coding chart. This chart is similar to the
MORT tree, discussed earlier.

The MORT tree has been abbreviated and adjusted to correspond to the
specific concerns of a BWR plant. As such, the cause coding tree starts
with a causal factor. From this starting point three major categories may
be entered: Equipment Malfunction, Personnel Miscue, and/or Act of
Nature/Man. The major categories are further divided into subcategories:

1. Act of Nature/Man contains:

a. Acts of Nature, which include:

- Lightning 

- Flood 

- Tornado/Wind 

- Hurricane 

- Icing 

- Aquatic Life 

- Seismic 
b. Man-made Cause which includes: 

- Electrical Grid Failure 

- Crash of an Airplane 

- Sabotage 

- Vandalism 

2. Personnel Miscue contains:

a. Operations 

b. Technical Support 

c. Maintenance 

These subcategories each contain the same eight components 
which are: 

- Procedures 

- Communications 

- Training 

- Human Factors

- Management System 

- Immediate Supervision 

- Quality Assurance (QA)

- Personnel 

A note is attached to the personnel section of these
subcategories which directs that this section of the chart
should only be used if no other cause can be found. This is
an effort to prevent the investigator from taking the "easy
way out" of performing his evaluation.

3. Equipment Malfunction contains:

a. Operation 

b. Maintenance

c. Equipment Reliability and/or Design 

d. Construction and/or Fabrication Modification 

These subcategories are also further divided into sections. 

Each section within a subcategory is further divided until the root cause
level for that subcategory is reached. When using this chart, each causal
factor of an event is coded to the lowest possible level in the tree and
then the next causal factor is coded. Sometimes it will not be possible to
code a factor all the way down to the root cause level of the tree. In
that case, the coding should stop at the lowest level of the tree for which
the questions can be answered. At other times a causal factor may result
in two or more root causes being coded from the tree. This result is
satisfactory since, in most cases, there is probably more than one root
cause that needs to be established.
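
The coding rule of stopping at the lowest level that can be answered can be sketched with a nested dictionary. This is a partial, illustrative model: it holds only the categories and subcategories listed in the text, the levels below them are left empty because they are not given here, and the names `CODING_TREE` and `code_factor` are assumptions.

```python
# The eight components shared by each Personnel Miscue subcategory.
COMPONENTS = [
    "Procedures", "Communications", "Training", "Human Factors",
    "Management System", "Immediate Supervision",
    "Quality Assurance (QA)", "Personnel",
]

# Partial tree built from the categories listed in the text.
CODING_TREE = {
    "Act of Nature/Man": {
        "Acts of Nature": {name: {} for name in (
            "Lightning", "Flood", "Tornado/Wind", "Hurricane",
            "Icing", "Aquatic Life", "Seismic")},
        "Man-made Cause": {name: {} for name in (
            "Electrical Grid Failure", "Crash of an Airplane",
            "Sabotage", "Vandalism")},
    },
    "Personnel Miscue": {
        sub: {comp: {} for comp in COMPONENTS}
        for sub in ("Operations", "Technical Support", "Maintenance")
    },
    "Equipment Malfunction": {name: {} for name in (
        "Operation", "Maintenance",
        "Equipment Reliability and/or Design",
        "Construction and/or Fabrication Modification")},
}


def code_factor(tree, answers):
    """Follow the answers down the tree; stop at the lowest level reached."""
    path = []
    node = tree
    for choice in answers:
        if choice not in node:  # question cannot be answered at this level
            break
        path.append(choice)
        node = node[choice]
    return path


print(code_factor(CODING_TREE, ["Personnel Miscue", "Maintenance", "Procedures"]))
print(code_factor(CODING_TREE, ["Equipment Malfunction", "Maintenance", "not determined"]))
```

A causal factor that leads to two or more root causes would simply be coded once for each answer path.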

Figure 4. MORT Event Symbols

Figure e. Cause Coding Tree

III. ROOT CAUSE ANALYSIS SCENARIO EXERCISE



Section III training objective:



8. Demonstrate RCA techniques on sample events using the Basic Root 
Cause methodology. 



Two short scenarios are provided to introduce application of RCA techniques and
the Root Cause Coding Flow Chart.

The scenarios were chosen to keep detailed plant specific information at a
minimum. Plant specific information, when used, is explained.

The material in this section should be supplemented with previously analyzed
events which occurred at your plant.

Section IV. ROOT CAUSE ANALYSIS OVERVIEW AND SUMMARY

Section IV training objectives:

7. Evaluate the Outcome of a Root Cause Analysis process for 
completeness, accuracy, and consistency with common-sense 
expectations.

The earlier sections of this text describe the definitions, techniques,
strengths and weaknesses, and outcome of Root Cause Analysis, and exercise
a technique. In this section we will define some parameters to measure the
outcome, and discuss the potential future benefits of Root Cause Analysis.

Measuring the RCA Process: 
The Outcome vs. the Expectation 

As stated earlier, the real objective of the Root Cause Analysis (RCA)
process is to improve plant performance by corrective reactions based on
accurate root causes. We can derive the parameters to evaluate Root Cause
Analysis outcome from this objective: accurate root cause, corrective
reactions, and improved plant performance.

ACCURATE ROOT CAUSE 

Specific measurements within "accurate root cause" might be ALL root
causes, the right root cause, and something the Japanese call warusa-kagen.

Finding ALL root causes might seem to be an unachievable goal, but it is
not unreasonable. A valid reason for attempting to find ALL root causes is
to not succeed. Any investigator who believes ALL root causes are found
has stopped short. If we define a problem as an opportunity for
improvement, we can state in the inverse that where there is room for
improvement there are problems, and therefore there are root causes. A plant
that finds ALL root causes has no room for improvement.

The right root cause is similarly difficult to measure. Perhaps the proof
is in the alternative: DON'T find the wrong cause! It is well documented
that finding the wrong root cause can be more damaging than not finding the
right cause. When ALL the root causes are found, ensure that the subset of
right root causes is complete. Challenge every root cause to break it down
into its most basic components. Check to ensure a root cause can, and did,
cause the event. Ensure the root cause, in isolation, can cause the event.
If it cannot, it may be a contributing factor without being a root cause.

Finally, root cause accuracy should address the concept of warusa-kagen. 
This is a Japanese term that refers to things that are not really problems 
but are somehow not quite right. If warusa-kagen are not fixed, they may 
develop into serious trouble and cause substantial damage. In the context 
of Root Cause Analysis, warusa-kagen may comprise facts discovered during 
the RCA which are not root causes, maybe not even contributing factors, but 
which are things that are not quite right. A component of accurately
determining root cause should include documenting these things. It helps 
us meet the objective of RCA. 

CORRECTIVE REACTIONS 

Ideal corrective reactions have several descriptors: prevent recurrence; 
don't cause other events; within our control; allow other objectives to be 
accomplished. 

Preventing recurrence of the subject event is a major reason for Root Cause 
Analysis and Root Cause Coding. Although this usually means preventing the 
same event at the same station/plant, it may be preferable to expand the 
scope, perhaps to all plants at the same station, all similar plants owned
by the same utility, all plants in that model line (e.g., BWR/6), all the
plants supplied by that NSSS vendor, etc. Whatever scope is chosen for 
measurement, the temptation to reduce the scope should be avoided, and when 
scope is expanded that action should be rewarded. 

Avoid scopes such as preventing recurrence of all scrams from Scram
Discharge Volume instrument rack shaking resulting from contractor drilling
for modification equipment installation that occur during the summer peak
loads. An exaggeration, to be sure, but it demonstrates the logic of
expanding the recurrence prevention scope. A more useful example of
corrective reaction might be to prevent recurrence of scrams caused by
contractor modification drilling by only doing such work in sensitive areas
during outages, disarming sensitive instruments in the area (if allowed by
specifications, of course), or finding a non-disruptive method of
performing the modification.

Events caused by the corrective reaction to another event are something we
try to avoid, and are mentioned only to remind us of the possibility.
Corrective reactions should be active in recurrence prevention, and passive
for another event as either a root cause or a contributing factor.

A plant must meet objectives to be economically feasible. The corrective
reactions must be prioritized according to the corporate objectives and
policies. In the best of times, not many root cause derived corrective
reactions have to wait for a significant geological-theological event, like
hell freezing over, to be implemented, but there are realistic limits on
the corrective reactions, such as the ability to make plant modifications
during a particular condition of the plant. The cost of implementing
corrective reactions is also a concern, but caution should be exercised about
saving a dollar this quarter, only to pay ten dollars for replacement power
next quarter. Shortsightedness costs, in event recurrence. Reality says
personnel promotion is largely based on quarterly cost control, yet
Operations and Maintenance costs continue to rise steadily. Does long-term
plant performance deserve as much attention as short-term cost control?
Absolutely.

Jean-Paul Bemer of Electricité de France said "An accurate determination of
the root cause of a failure will allow the utility to consider different
corrective (re)actions corresponding to different technical solutions."
The implied message is that, as mentioned earlier, management reserves the
right to manage and make those decisions. Unless the root cause analysis
methods are thorough in finding all the root causes, the management
decision is based on incomplete information. No one wants their manager to
make decisions that could affect the long-term financial health of their
company based on incomplete information.
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IMPROVED PLANT PERFORMANCE 

Plant performance improvements can be measured in several ways, all of
which have merit. Some of the more common performance measures are fewer
Licensee Event Reports, more availability, and, of course, fewer scrams.

Licensee Event Reports may actually increase INITIALLY as a result of the
implementation of Root Cause Analysis programs. When increasing the
awareness and participation of plant personnel in the process of
determining root causes, implementing corrective reactions and monitoring
the performance of the plant related to that cause and event, the number of
reportable events may rise just due to more things being noticed. However,
as corrective reactions are implemented and monitored, the events should
decrease. If not, the effectiveness of the RCA process should be
reexamined.

Availability of the plant to generate electrical power is near and dear to
plant management's heart, as it should be. Availability is easily measured
(although there seem to be several "standard" methods to calculate it),
and is highly visible. Corrective reactions must be weighed against the
effect on availability, and again the caution against the short-term effect
is valid.

Reducing the number of scrams is the goal of this course's sponsor. The 
BWR Owner's Group Scram Frequency Reduction Committee efforts in achieving 
specific goals related to the number of scrams are manifold. The NUMARC 
goals are specific targets and the Japanese plants are often referred to 
as models.

In his discussion of Root Cause Analysis at the BWROG SFRC meeting, Dean
Gano of WPPSS mentions the Japanese power plant statistics for scram 
frequency. "First of all, they've only had 97 scrams in their entire 
history of power operation of nuclear power plants ... ninety-seven, that's 
all they've ever had. Also, personnel error, where an individual made a 
mistake, only happened every two or three years. So, there's something 
there. I don't know what it is. It may be cultural." 

Mark Paradies recalls the philosophy of Japanese plant management from a
speech delivered at a previous meeting: "The Japanese plant philosophy on
how to prevent scrams ... and it was, you beat it to death. You didn't 
start that plant up again until you addressed what I would call the root 
causes, AND YOU FIXED THEM. If it took thirty days, it took thirty days. 
You didn't start back up again until you had it fixed, because you weren't 
going to have that next scram happen again, ever, period." Mark went on to 
relate how vendor representatives, kept on call 24 hours a day, were called 
immediately to find out why that plant scrammed and get it fixed, because 
they weren't going to start that plant up until it was fixed. 

Mr. Zenzaburo Katayama, the assistant manager at Toyota Motor's Total
Quality Control Department, gave an example of the "culture" relating to
plant shutdown.

"At Toyota, we stop the entire line when we find a defective part. 
Since all plant operations are coordinated, it means that when one 
plant stops, the effect ripples back to the previous process, and
eventually the Kamigo plant, which manufactures engines, stops too. 
If the stoppage is prolonged, all the plants have to stop operation. 

Stopping the plant is a serious blow to management. And yet, we dare 
to stop it because we believe in quality control. Once we have taken 
the trouble of stopping the plant operation, we have to make sure that 
we find the cause of the trouble and adopt a countermeasure so that 
the trouble never recurs."

The cultural difference Dean Gano referred to deserves further
consideration. One cultural difference is the Japanese attitude toward
constant improvement. This process of continual incremental improvement is
something the Japanese call kaizen. In his book, KAIZEN - The Key to
Japan's Competitive Success, Masaaki Imai says, "The essence of KAIZEN is
simple and straightforward: KAIZEN means improvement. Moreover, KAIZEN
means ongoing improvement involving everyone, including both managers and
workers. The KAIZEN philosophy assumes that our way of life - be it our
working life, our social life, or our home life - deserves to be constantly
improved." Let's see how Root Cause Analysis' objectives can be satisfied
more efficiently with KAIZEN principles.

KAIZEN

Key to Future Root Cause Analysis?

One more time: what is the goal of Root Cause Analysis? To improve plant
performance by corrective reactions based on accurate root causes. RCA is
just one effort to improve plant performance. Some of the difficulties and
shortcomings of Root Cause Analysis have been briefly mentioned earlier in
the text. Let's examine how RCA can be improved, with KAIZEN in mind,
which logically should result in improved plant performance by improving
the process.

Dean Gano emphasizes the importance of the knowledge level of the expert
team performing the investigation. The facts gathered in their RCA efforts
need interpretation, the expert members' role. However, this merely
touches on the potential application. The facts gathered by the personnel
involved in the incident overwhelm the facts gathered by mechanical or
electronic means. The message is simple: get everyone involved. Remember,
KAIZEN is ongoing improvement involving everyone. Any effort by an expert
team dwindles in comparison to a team effort by the personnel involved in
the actual event. Even the interviewer concept is weakened by the time
required to establish the kind of trust necessary for openness and honesty.
The team, working together, should be able to reconstruct the event
efficiently, as they will counter and question and prompt each other during
the process. Have the participants in the event participate in the RCA
process.

A question worth exploring now is, why don't people participate now? There
is probably not an RCA investigator anywhere who has a "KEEP OUT" sign
posted on his/her door. Why doesn't anyone come in, except by mandate? It
might be because no one asked, or because nothing has ever been done about
suggestions (or problems pointed out) before, or because the result favored
for openness and honesty is punishment, either direct or indirect. This is
not a question to be answered by anyone other than the plant staff.
However, it is a question that must be answered to make any form of RCA
effective. Another quote from M. Imai may hold a key: "I would suggest
that information rots . . . Information that is collected but not properly
used rots rapidly. Any manager who does not forward the information to the
interested parties, and any management that does not have a system to use
information, is doing a great disservice to the company and creating
massive waste in the form of lost opportunities and wasted executive
time."

Another reason for nonparticipation is the lack of participation in
forming the corrective reaction, and the resulting lack of ownership, to
the extreme of taking delight in the failure of a non-owned reaction. "The
permanent approach (to Group-Oriented KAIZEN) . . . calls for the full PDCA
cycle (Plan, Do, Check, Action) and demands that team members not only
identify problem areas but also identify the causes, analyze them,
implement and test new countermeasures, and establish new standards and/or
procedures." Dean Gano touched on this subject at his presentation at the
BWROG SFRC meeting, saying that since they started getting the operations
personnel involved in the root cause analysis and the solution, there are a
lot fewer complaints about the "stupid" causes and solutions found when only
the non-operations staff was involved.

Last, but not least, when the KAIZEN concept of warusa-kagen becomes part
of the Root Cause Analysis process, not only are the actions reactive, they
are preventive. When preventive actions outnumber corrective reactions,
Root Cause Analysis will have accomplished its full potential.

Abstract for: APPENDIX B, TRIP INVESTIGATION/ROOT CAUSE DETERMINATION
PROGRAM written by/for the Babcock & Wilcox Owners
Group

This document describes the benefits, and contains the overall guidelines
to the Babcock & Wilcox (B&W) Owners Group utilities, for establishing a
thorough event investigation and root cause determination program. These
guidelines suggest the amount and type of on-site and off-site resources
which the utilities should use when developing and implementing this type
of program.

The document also suggests the conditions under which this program should
be activated, as well as the tools, techniques and the analysis processes
the program should contain. These program elements will allow the
utilities to identify the causal factors of reactor trips, plant
transients, and other performance anomalies. Once the causal factors are
identified, recommendations for effective corrective action can be made and
prioritized to most effectively prevent the recurrence of these events.

TAKE AWAY ITEMS: 

o Guidelines provide specific program elements for identifying causal
factors.

o Identifying causal factors allows:

- effective corrective actions to be recommended

- prioritizing these corrective actions

o Dedicated team of investigators, outside normal organization, is
recommended to provide:

- reliability

- accountability

- objectivity

- broad spectrum of plant operating expertise

- peer review and consultation features

o Develop plan for supplementing utility's investigation with outside
resources.

- Using transient categories defined in Transient Assessment
Program Description, assistance can be given to determine
causal factors of the more serious events.

o Minimum conditions under which a root cause investigation should be
activated are:

- an unplanned reactor trip

- planned trips where expected post-trip response doesn't occur

- safety system actuations per INPO performance indicators

- equipment malfunctions which degrade or prevent control of the basic 
control functions or result in an unexpected transient. 

o Develop a procedure to provide a written guide that contains the
following process elements. 

- Obtain factual information relevant to the event from 
sources which should include (but are not limited to): 

* Personnel Interviews 

* Recorded Instrument Data 

* Computer Alarm Printouts 

* Procedures 

* Logs 

* Transient Monitor Data 

* Completed Work Requests 

* Previous Event Reports 

* Interoffice Correspondence 

* NPRDS and other data bases 

- Clearly reconstruct the event 

- Using a structured analytical tool identify: 

* the less obvious causal factors 

* the conditions 

* all pertinent events/actions 

- Classify entire event and categorize significant causal factors

* allows for effective data base entry

o Develop a procedure to provide a written guide that contains the
following process elements. (Continued)

- Document any corrective actions needed to prevent recurrence of the
event.

- Generic issues should be communicated to other B&W utilities. 

o Management Oversight and Risk Tree (MORT) and Event and Causal Factor 
Charting are recommended for use. 

- are supported by training and implementation materials 
from outside sources and are used and supported by INPO. 

o MORT has two meanings pertinent to the B&W program: 

- Total safety program concept focused on programmatic 
control of industrial safety hazards. 

- An actual logic diagram which displays the structural set 
of interrelated safety program elements and concepts.

o As a safety management system, MORT was designed to: 

- Prevent safety-related oversights, errors, and omissions 

- Identify, assess, and refer residual risks to proper 
management levels for appropriate actions. 

- Optimize allocation of resources to the safety program. 

o MORT encompasses several specific tools and techniques; two have been
selected for implementation: Event and Causal Factor Charting and MORT
Tree Analysis.

o Change analysis in MORT incorporates concepts of the Kepner-Tregoe method,
so it isn't used.

o Other tools and techniques may be adopted at a later date. 
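
The minimum activation conditions listed above amount to a simple screening check. The following is only a sketch of that check in Python, not part of the B&W program; the class name, field names, and function name are hypothetical, and the B&W document itself governs how each condition is actually judged.

```python
from dataclasses import dataclass


@dataclass
class PlantEvent:
    """Hypothetical summary of an event; all field names are illustrative."""
    reactor_trip: bool = False
    trip_was_planned: bool = False
    post_trip_response_as_expected: bool = True
    safety_system_actuation: bool = False
    equipment_malfunction: bool = False
    control_function_degraded: bool = False
    unexpected_transient: bool = False


def investigation_required(event: PlantEvent) -> bool:
    """Return True if any of the minimum activation conditions is met."""
    # An unplanned reactor trip always qualifies.
    unplanned_trip = event.reactor_trip and not event.trip_was_planned
    # A planned trip qualifies only if the post-trip response was unexpected.
    abnormal_planned_trip = (
        event.reactor_trip
        and event.trip_was_planned
        and not event.post_trip_response_as_expected
    )
    # An equipment malfunction qualifies if it degrades control of a basic
    # control function or results in an unexpected transient.
    significant_malfunction = event.equipment_malfunction and (
        event.control_function_degraded or event.unexpected_transient
    )
    return (
        unplanned_trip
        or abnormal_planned_trip
        or event.safety_system_actuation
        or significant_malfunction
    )
```

A check like this only decides whether to start an investigation; it says nothing about the cause.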

Abstract for: SAVANNAH RIVER EXPERIENCE USING A CAUSE CODING TREE TO
IDENTIFY THE ROOT CAUSE OF AN INCIDENT written by Mark 
Paradies and David Busch in October, 1986. 

This document describes the Cause Coding Tree developed at the Savannah 
River Plant by their Reactor Safety Evaluation Division. This Cause Coding 
Tree was developed to systematically evaluate incidents at the Savannah 
River Plant, identify their root causes, record these root causes, and 
analyze the trends of these causes. By providing a systematic method to 
identify correctable root causes, the system helps the incident 
investigator to ask the right questions during the investigation. It also 
provides the independent safety analysis group and management with 
statistics that indicate existing and developing trouble spots.

A description of the Savannah River Plant Cause Coding Tree is included in
the article, as well as some discussion of the differences, and the
reasons for these differences, between it and the systems it was drawn
from.

TAKE AWAY ITEMS 

o New system was created from the best parts of: 

- INPO's Human Performance Evaluation System's (HPES's) root cause 
analysis 

- INPO's Significant Event Report root cause identification system 

- EG&G's Management Oversight and Risk Tree (MORT) system for root 
cause identification 

- Methods used at Savannah River Plant to identify incident causes 

- Events and Causal Factors Charting 

o Root Cause defined as: The most basic cause that can reasonably be 
identified and that management has control to fix. 

o When enough questions are asked it becomes easy to specify corrective
actions to fix system problems.

o The criteria used to develop the Savannah River Plant Cause Coding
Tree were developed considering the plant's needs and included:

- Make system usable with current incident investigation system.

- Make system point to the root cause of incident or as close as 
possible. 

- Make system-provided statistics answer the questions Savannah River
Plant wants to answer now, but flexible so that if different 
questions arise in future, system will be able to provide statistics 
to answer them. 

- Make system easy for the beginner to use. 

o Three methods used to make use of varied backgrounds but still arrive 
at a standardized coding. 

- Require an "Events and Causal Factors Chart" for each incident.

* Helps to logically analyze incident 

* Determine if facts of incident have been uncovered 

* Relate corrective actions to causal factors 

- Hold group peer reviews of coding of each event. 

* Incorporates various expertise of personnel 

* Causes are recorded after group reaches consensus on the
cause(s).

- Developed list of "repeat failures" to standardize this coding part.

Abstract for: USING ROOT CAUSE ANALYSIS OF OPERATING EXPERIENCE TO
IMPROVE MANAGEMENT'S PERFORMANCE written by Mark 
Paradies and David Busch 

One of the most important factors in the operation of the plant, the 
plant's management, is often left with very little feedback on their own 
performance. This paper shows a technique to provide management feedback 
on their performance. The technique, developed to help improve the safety 
performance of the reactors at Savannah River Plant, involves the use of 
Events and Causal Factors Charting in combination with a Root Cause Coding 
Tree to analyze plant incidents. This analysis provides data that can be 
used to identify developing problem areas and correct the root causes so 
that similar incidents can be avoided in the future. The advantages 
observed at Savannah River Plant since implementing this system are also 
provided.

o Accurate feedback on management performance is critical to good, safe
system operation.

o Describes a system for analyzing the root cause of incidents

- Two techniques are combined to identify all root causes.

* Events and Causal Factors Charting is used to determine causal 
factors that were contributory causes to the incident. 

* Contributory causes are analyzed with Root Cause Coding Tree to 
identify the root causes of the incident. 



o Two main ways of using the data from this analysis process: 

- Design measures that will prevent the recurrence of a specific 
incident.

- Look for trends in the root causes over a period of time

* Used to go from a specific problem identified in a 
particular incident to generic (system) problems. 

* Can predict growing problem areas which require 
correction before any more specific problems occur. 

o Operating experience can provide management performance feedback by:

- Provides a Basic Root Cause Category that deals with three 
main functions of management. 

* Setting standards and policies, and developing 
administrative controls to prevent incidents. 

* Auditing the use of standards and controls and 
ensuring that the standards are applied. 

* Taking timely, effective corrective action to fix 
discrepancies . 

- By reviewing trends, management can see where more
resources are required to improve the plant's performance. 

o Benefits derived from system implementation at Savannah River Plant:

- Provided data that confirmed beliefs previously not supported by 
hard facts. 

* Problems easier to recognize and address. 

- Started a trend away from placing blame on those involved
and toward finding corrective actions that prevent recurrence.

* Increased trust and cooperation between managers and operators. 

- Provides investigators with a systematic investigation methodology. 

* Aids in determining the types of questions to ask. 

* Graphically shows when a root cause is achieved. 

- Corrective actions to identified problems easier to see. 

* Higher percentages of actions adopted because they appear
obvious.

- Provides another method for management to measure its performance.

* Goals can be set and monitored.

* Performance trended from year to year.

* Isolated areas of improvement can be identified.
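
Looking for trends in coded root causes, as described in the items above, is at heart a counting exercise. The sketch below is one hedged way to do it in Python; the function names, the year-by-year grouping, and the category labels in the usage note are assumptions for illustration and do not come from the Savannah River system.

```python
from collections import Counter


def root_cause_trends(records):
    """Count coded root causes per period.

    `records` is an iterable of (period, root_cause_category) pairs,
    for example one pair per root cause coded during an investigation.
    Returns a dict mapping each period to a Counter of categories.
    """
    trends = {}
    for period, category in records:
        trends.setdefault(period, Counter())[category] += 1
    return trends


def growing_categories(trends, earlier, later):
    """List categories coded more often in `later` than in `earlier`."""
    before = trends.get(earlier, Counter())
    after = trends.get(later, Counter())
    # A Counter returns 0 for a category never coded in the earlier period.
    return sorted(c for c in after if after[c] > before[c])
```

For example, with made-up records in which "procedures" is coded once in one year and twice in the next, growing_categories would flag "procedures" as a developing problem area worth attention before more specific incidents occur.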

Abstract for: USER'S GUIDE FOR REACTOR INCIDENT ROOT CAUSE CODING
TREE written by Mark Paradies and David Busch in
December, 19£ j 

The Reactor Incident Cause Coding Tree is designed to allow identification
of root causes of reactor incidents at the Savannah River Plant, thereby
leading to trending of useful information and the development of corrective
actions to prevent their recurrence. This document defines the terminology
of the Reactor Incident Cause Coding Tree at the Savannah River Plant and
explains how to use this tree in a step-by-step manner.

TAKE AWAY ITEMS 

o Guide allows consistency of coding among all incident investigators.

o First, find all causal factors for which root causes need to be
determined.

- Causal factors are actions or failures that, if
eliminated, would have prevented the incident from
occurring or significantly mitigated it.

o Determine root causes of causal factors with Root Cause Coding Tree.

o Tree has six levels (A through F) 

- Least detail cause near top of tree 

- Most detailed cause near bottom of tree 

o Each Causal Factor is coded, one at a time, starting at the top of the 
tree and working down as far as known information will allow. 

o The lowest level of codable detail should be listed as the root
cause(s).
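
The coding rule above (start at the top of the tree, work down as far as known information allows, and list the lowest codable level) can be sketched as a short tree walk. This is only an illustration in Python: the Node class, the function name, and every label in the example tree are hypothetical and are not taken from the actual Savannah River tree.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One box on a cause coding tree (labels here are hypothetical)."""
    label: str
    children: dict = field(default_factory=dict)  # branch label -> Node


def code_causal_factor(root, known_branches):
    """Walk down from the top of the tree, one known branch at a time.

    Stops when the known information runs out or does not match any
    branch, and returns the path coded so far. The last entry is the
    lowest level of codable detail.
    """
    path = [root.label]
    node = root
    for branch in known_branches:
        child = node.children.get(branch)
        if child is None:
            break
        node = child
        path.append(node.label)
    return path


# Hypothetical four-level tree; the real tree has six levels (A through F).
example_tree = Node("Incident", {
    "Personnel": Node("Personnel", {
        "Procedures": Node("Procedures", {"Not used": Node("Not used")}),
    }),
    "Equipment": Node("Equipment"),
})
```

Each causal factor would be passed through code_causal_factor separately, one at a time, as the guide directs.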

Abstract for: ROOT CAUSE ANALYSIS EXECUTIVE SUMMARY written by/for
the BWR Owners' Group (BWROG) 

This document, along with a letter from C.L. Larson of GE to B. Williamson
of TVA entitled "SFRC Root Cause Analysis, Events and Causal Factors
Charting", describes the usefulness of charting the causal factors leading
to a reactor scram in determining the root causes for that scram. Events
and Causal Factors Charting, as described in the report DOE/SSDC-76-45/14,
Events and Causal Factors Charting, and adapted from the Savannah River 
Plant's "User's Guide for Reactor Incident Root Cause Coding Tree" 
document, is a technique for logically displaying the events related to a 
scram, illustrating multiple causes of a scram, and ensuring that the 
investigation of a scram has not overlooked any causes. This type of 
charting also helps in identifying where corrective actions are needed. 

As described, the methodology of developing the Events and Causal Factors 
Charts is not significant though some consistency of charting causal 
factors is recommended. This will make understanding easier for all 
members of the BWR Owners' Group. These documents provide a method, with a 
few general rules, for charting the causal factors leading or contributing 
to a reactor incident. Also included in these documents is an example of a
Cause Coding Tree.

TAKE AWAY ITEMS

o Events and Causal Factors Chart should contain all details of the 
incident investigation. 

- allows identification of all contributing causes 

- allows eventual identification of all root causes

o Sequence of the incident is laid out in a time line with each event 
leading to it 

- Events are investigated to determine their cause(s). 

- Causal factors of the incident are determined.

o Root cause(s) of the causal factor(s) is determined using the cause
coding tree.

- Each causal factor is coded to the lowest possible level
in the tree.

- Stop at the lowest level of the tree for which the
questions can be answered if it's not possible to code to
the root cause level of the tree.



o At times a causal factor will result in two or more root causes being 
coded from the cause coding tree. 
- Called dual coding

Abstract for: EVENTS AND CAUSAL FACTORS CHARTING (DOE 76-45/14,
SSDC-14) prepared by: JR Buys and JL Clark

This document discusses the goal of the Department of Energy (DOE) to build 
and maintain an accident investigation process that utilizes state-of- 
the-art investigative and analytical methods. This process is used to 
identify the various causes of an accident occurrence so that action can be 
taken to prevent their recurrence. The document also discusses the nature 
of accident investigation and describes the technique of Events and Causal 
Factors charting as an investigative tool. 

Within the description of the technique of Events and Causal Factors
charting, a general format is suggested and the criteria for determining
the events which make up the accident sequence are given. This document
also contains a typical application ("simple" accident) as an example for
using Events and Causal Factors charting.

The document also discusses the seven key elements in the practical 
application of this technique and the benefits derived from using Events 
and Causal Factors. 

TAKE AWAY ITEMS 

o Vital factors in accident causation emerge as sequentially and/or 
simultaneously occurring events, which interact with existing 
conditions to form a multifactorial path to the accident. 

o Two basic foundation principles are suggested: 

- Accidents are the results of successive events that
produce unintentional harm. 

- The accident sequence occurs during the conduct of some 
work activity

o Experience shows that accidents are rarely simple and almost never
result from a single cause.

- Usually multifactorial

- Develop from a series of events which include: 

* performance errors 

* changes 

* oversights 

* omissions 

o Events and Causal Factors chart should begin as soon as investigator
starts gathering factual data of the accident. 

- Several benefits for starting chart quickly: 

* Organizes the accident data 

* Helps in guiding the investigation 

* Aids in validating and confirming the true accident sequence 

* Helps identify and validate the factual findings, probable 
causes and contributing factors 

* Aids in simplifying the investigation report 

* Illustrates the accident sequence in the investigation report 

o Most effective when used with other MORT tools that provide supportive
correlation.

o Use whatever method of application of this technique that seems to
work best.

o Suggested Format:

- Events should be enclosed in rectangles, conditions in ovals.

- Connect events with solid lines

- Connect conditions with dashed lines

- Base events and conditions on factual evidence, presumptive items 
should be denoted by dashed line rectangles and ovals 

- Primary sequence of events depicted in straight horizontal 
line joined by bold printed arrows 

- Secondary event sequence, contributing factors, or systemic factors
depicted in horizontal lines above or below the primary sequence

- Arrange events in chronological order from left to right 

- Events should track in a logical progression

o Suggested criteria for event descriptions

- Describe an occurrence or happening and not a condition,
state, circumstance, issue, conclusion, or result

- A short sentence containing only one subject and one 
active verb 

- Event should be precisely described 

- Event should be quantified when possible 

- Each event should be derived directly from the event(s)
and condition(s) preceding it

o Benefits of the Events and Causal Factors charting technique:

- Meets the general purposes of accident investigation

* Provides a cause-oriented explanation

* Provides a basis for beneficial changes to prevent recurrence 

* Helps delineate areas of responsibility 

* Ensures objectivity in investigation conduct 

* Provides a quantitative data organization

* Provides an operational training tool 

* Provides an effective aid for future systems design 

- Helps in conducting the investigation 

* Aids in developing evidence, detecting all causal
factors and in determining the need for in-depth
analysis.

* Clarifies reasoning

* Illustrates multiple causes

* Visually portrays interactions and relationships of
involved organizations and individuals

* Illustrates the chronological sequence of events

* Provides flexibility in interpretation and
summarization of collected data

* Communicates facts in a logical and orderly manner

* Links specific factors to organizational and
management control factors

o Benefits of the Events and Causal Factors charting technique (Cont.):

- Helps in writing the investigation report

* Provides a check for completion of investigative logic

* Provides a method for identifying matters requiring 
further investigation or analysis 

* Provides a logical display of facts where valid 
conclusions can be drawn 

* Provides consistent subject titles for "discussion of 
facts" and "analysis" paragraphs 

* Provides method for determining if purpose and specific 
objectives of the investigation have been met 

* Provides differentiation between analysis of facts 
and conclusions reached 

* Simple method of describing the accident sequence and 
its causes to a reading audience of different 
backgrounds 

* Source of identification of organizational needs and 
the formulation of recommendations to meet those needs 

* Provides a method for evaluating the factual basis of 
possible recommendations 

* Useful in solving unanticipated problems with 
preparing the final report of specific accident 
investigations 

o Seven key elements in the practical application of Events 
and Causal Factors charting: 

- Begin early 

- Use the guidelines

- Proceed logically using available data

- Use an easily updated format

- Correlate with other MORT investigative tools

- Include appropriate detail and sequence length

- Make a short executive summary chart when necessary 
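
The suggested chart format above (events in rectangles, conditions in ovals, presumptive items in dashed outlines, events in chronological order from left to right) maps naturally onto a small data structure. The following Python sketch is only one possible representation; the class, field, and function names are hypothetical and are not part of the DOE document.

```python
from dataclasses import dataclass


@dataclass
class ChartItem:
    """One item on an Events and Causal Factors chart (illustrative only)."""
    text: str
    kind: str                  # "event" (rectangle) or "condition" (oval)
    time: float = 0.0          # position in the sequence; units are arbitrary
    presumptive: bool = False  # True if not yet backed by factual evidence


def describe(item):
    """Return the drawing convention for an item under the suggested format."""
    shape = "rectangle" if item.kind == "event" else "oval"
    outline = "dashed" if item.presumptive else "solid"
    return f"{outline} {shape}: {item.text}"


def primary_sequence(items):
    """Return the events only, in chronological (left to right) order."""
    events = [item for item in items if item.kind == "event"]
    return sorted(events, key=lambda item: item.time)
```

Keeping the chart as data rather than as a drawing is one way to satisfy the "easily updated format" element, since items can be added or reclassified as the investigation proceeds.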

Abstract for: HUMAN PERFORMANCE EVALUATION SYSTEM written for INPO in
August 1984 

The Institute of Nuclear Power Operations (INPO), working with several 
member and participant utilities, has developed a nonpunitive program 
designed to identify, evaluate, and correct situations that involve human 
performance errors. The program is called Human Performance Evaluation 
System (HPES). Its primary goal is to improve human reliability in overall 
plant operations by reducing human error through correction of the 
conditions that cause the errors. 

This document describes the Human Performance Evaluation System goals, its 
scope, the methodology and the benefits derived from using this system. 
Also included within this document are the various forms used when 
performing the Human Performance Evaluation System after an event.

TAKE AWAY ITEMS 

o Program was founded on the following premises: 

- Human error can be reduced and minimized 

- Causes of minor events are often the same as those for major events 

- Management is of key importance 

* People want to perform well 

* Punitive actions often do not correct underlying
causes and discourage reporting of mistakes

- Identification and correction of causes can prevent repeat
events and reduce opportunities for similar events

- Utility sharing of lessons learned promotes better plant and 
industrial understanding and correction of human error causes 

o Root cause analysis methodology does not ask who did it; but asks 
what, where, when, how, and why it happened.

o Program's primary implementation elements are:

- Reporting 

- Analysis 

- Corrective Actions 

- Feedback 

o The following plant personnel are involved in program implementation:

- Line management 

* Uses program results to resolve causes of human 
performance problems 

- Reporters 

* All personnel who report human error events to the 
program coordinator 

- Program coordinator 

* Specially trained individual who analyzes events, 
determines their causes, recommends corrective 
actions, and provides feedback to reporters 

- Evaluators 

* Specially trained individuals who assist the program 
coordinator in evaluating human performance problems 

o INPO provides the following support for program implementation:

- Training of program coordinators and evaluators 

- Program implementation assistance 

- Report screening 

* A reviewer provides comments or experience-based
information from other participants

- Maintenance of a human performance data base 

- Regular data base analysis and feedback 

- A quarterly newsletter focusing on human performance

- Operation of an information exchange network 

- Meetings to discuss lessons learned, new developments, and 
advanced evaluation techniques 

- Sponsorship/support of related workshops 

- Materials for program operation and training 

- Annual reviews of program methodology and effectiveness 

Abstract for: METHOD IDENTIFIES ROOT CAUSES OF NUCLEAR REACTOR SCRAMS 
an article written by JL Burton for Power Engineering 
magazine in October 1987 

This article discusses the evaluation done at the Gulf States Utilities' 
River Bend Station in which a root cause analysis was performed for each 
unplanned reactor scram. The utility's Independent Safety Engineering 
Group (ISEG) analyzed all River Bend Station scrams not due to testing that 
were experienced from the plant's initial criticality date through 
December of 1986. 

This analysis was intended to provide three results, which were: 

1) Identify trends relating to scrams to focus attention on problem 
areas. 

2) Identify and rank scram root causes to allow management allocation of 
resources in the most effective manner possible. 

3) Identify corrective actions for the scram root causes to prevent their 
recurrence, thereby reducing scram frequency. 



This article also discusses the Management Oversight and Risk Tree (MORT) 
technique that was used to perform the root cause analysis of this 
evaluation. 



TAKE AWAY ITEMS 



o MORT techniques were modified slightly 

- An importance score was assigned to each root cause, based 
on its level of contribution to the scram 

- These scores were weighted by frequency of occurrence to 
develop root cause rankings for each scram and for a 
composite of all scrams 
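
The scoring and ranking described in the two items above can be sketched in a few lines of Python. This is only an illustration: the scram names, cause names, and the 1 to 10 score scale are hypothetical, and the weighting is assumed to be a simple sum of a cause's importance scores over every scram in which it appears. The article's exact formula is not given in this abstract.

```python
from collections import defaultdict

# Hypothetical data: for each scram, the root causes identified and an
# importance score (assumed scale 1-10) for each cause's contribution.
scram_scores = {
    "scram_A": {"procedure inadequate": 8, "design deficiency": 3},
    "scram_B": {"procedure inadequate": 5, "training deficiency": 6},
    "scram_C": {"design deficiency": 9},
}


def rank_within_scram(causes):
    """Rank the root causes of a single scram by importance score."""
    return sorted(causes.items(), key=lambda item: item[1], reverse=True)


def composite_ranking(scores_by_scram):
    """Rank root causes across all scrams.

    Summing a cause's scores over every scram in which it appears is one
    assumed way of weighting importance by frequency of occurrence.
    """
    totals = defaultdict(int)
    for causes in scores_by_scram.values():
        for cause, score in causes.items():
            totals[cause] += score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


for cause, total in composite_ranking(scram_scores):
    print(f"{total:3d}  {cause}")
```

A cause that contributes moderately to many scrams can outrank a cause that dominates a single scram, which is the point of weighting by frequency.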

o MORT is based on the concept that risks are a combination of 
three distinct elements: 

- a hazard (or energy release) 

- a target which can be damaged by the hazard 

- one or more barriers which separate the hazard from the 
target 

* Undesirable effects (accidents or scrams) occur due 
to the breakdown of these barriers, allowing a hazard 
to reach its target 

o The extreme level of detail and complex structure of the MORT 
technique limit the feasibility of using it for the analysis of 
numerous events 

o All scrams analyzed could be traced to more than one root cause 

- an average of 9.5 root causes per scram were identified 
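
The hazard, target, and barrier concept above can be pictured with a small data model. The sketch below is not part of MORT itself. The class names, field names, and the example hazard and barriers are hypothetical, and it assumes the simplest case in which the hazard reaches the target only when every barrier has broken down.

```python
from dataclasses import dataclass, field


@dataclass
class Barrier:
    """A control that separates the hazard from the target."""
    name: str
    intact: bool = True


@dataclass
class Risk:
    """A hazard, a target, and the barriers between them."""
    hazard: str
    target: str
    barriers: list = field(default_factory=list)

    def hazard_reaches_target(self):
        """True when no intact barrier remains (assumed simplification)."""
        return not any(b.intact for b in self.barriers)

    def failed_barriers(self):
        """Names of the barriers that broke down; each is a lead for analysis."""
        return [b.name for b in self.barriers if not b.intact]


# Hypothetical example: two barriers stand between the hazard and the target.
risk = Risk(
    hazard="feedwater level transient",
    target="reactor at power",
    barriers=[Barrier("surveillance procedure"), Barrier("operator response")],
)
risk.barriers[0].intact = False
print(risk.hazard_reaches_target())  # one barrier still holds
risk.barriers[1].intact = False
print(risk.hazard_reaches_target(), risk.failed_barriers())
```

Each failed barrier is a separate line of inquiry, which is consistent with the finding that every scram analyzed traced to more than one root cause.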

Abstract for: ROOT CAUSE AND HOW TO FIND IT written by Dean Gano of 
WPPSS 

This article documents what has been learned through participation in the 
BWR Owners Group (BWROG) Scram Frequency Reduction Committee (SFRC) over 
the past two years. This document provides an in-depth discussion of the 
definition of root cause, the use of the cause-and-effect process to find 
the root cause, and the use of proper cause categorization as a means to 
better understand the nuances of root cause. It also provides a detailed 
statistical breakdown of reactor trips at boiling water reactors for 1986 
as compiled from BWR Owners' Group Scram Frequency Reduction Committee 
data. 

TAKE AWAY ITEMS 

o Root cause definition: The most basic reason(s) for an effect, which 
if corrected will prevent recurrence. 

o Method used to determine root cause is unimportant, as long as goal 
(to prevent recurrence) is achieved. 

o Root cause criteria: 

- A solution that prevents recurrence 

- A solution that is within our control 

- A solution that allows us to meet other objectives 

o Root cause process: 

- Use an expert team 

- Start with the primary effect (reactor trip) 

- Use cause-and-effect process in conjunction with the root 
cause criteria 

- After a root cause has been determined, apply the 
definition of a root cause and verify it 

Cause categorization provides an order for counting and comparing 
similar recurring events. 
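
The root cause criteria and process listed above can be sketched as a walk down a cause-and-effect chain. The sketch is illustrative only. The class, its field names, and the example chain are hypothetical, and it assumes that the root cause is the most basic cause in the chain whose solution still meets all three criteria.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cause:
    """One link in a cause-and-effect chain (field names are illustrative)."""
    description: str
    prevents_recurrence: bool      # would correcting it prevent recurrence?
    within_control: bool           # is the solution within our control?
    meets_other_objectives: bool   # does the solution allow other objectives to be met?
    caused_by: Optional["Cause"] = None  # next, more basic cause (None at the end)

    def meets_criteria(self) -> bool:
        return (self.prevents_recurrence
                and self.within_control
                and self.meets_other_objectives)


def find_root_cause(first_cause: Optional[Cause]) -> Optional[Cause]:
    """Follow the chain from the primary effect and return the most basic
    cause that meets all three criteria, or None if no cause qualifies."""
    candidate = None
    cause = first_cause
    while cause is not None:
        if cause.meets_criteria():
            candidate = cause
        cause = cause.caused_by
    return candidate


# Hypothetical chain behind a reactor trip (the primary effect).
vendor = Cause("vendor manual omitted the maintenance interval", True, False, True)
no_pm = Cause("no preventive maintenance task for the valve actuator",
              True, True, True, caused_by=vendor)
valve = Cause("feedwater valve failed closed", False, True, True, caused_by=no_pm)

root = find_root_cause(valve)
print(root.description if root else "no cause meets the criteria")
```

In this example the valve failure is rejected because repairing the valve alone would not prevent recurrence, and the vendor manual is rejected because it is outside the plant's control. The final step of the process, applying the definition to verify the result, remains a judgment for the expert team.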


Three major cause categories: 

- People 

- Procedures 

- Hardware 

Subcategories for Personnel Error (37% of 1986 BWR scrams) 

- Procedures not followed 

- Training deficiency 

- Lack of mental attention 

- Programmatic deficiencies 

- Communication deficiencies 

Subcategories for Procedural Error (16% of 1986 BWR scrams) 

- Procedure incomplete or nonexistent 

- Incorrect procedure information 

Subcategories for Equipment Failure (48% of 1986 BWR scrams) 

- Design deficiency 

- Maintenance deficiency 

- Premature wearout 

- Installation/manufacturer deficiency 
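
Percentages like those quoted above come from counting categorized events. A minimal sketch of that tally follows. The event records and the function name are hypothetical, and the figures it prints are not the 1986 BWROG data.

```python
from collections import Counter


def category_shares(events):
    """Return each major category's share of the events, in percent.

    `events` is a list of (event_id, category, subcategory) tuples.
    """
    counts = Counter(category for _, category, _ in events)
    total = sum(counts.values())
    return {category: round(100 * n / total, 1) for category, n in counts.items()}


# Hypothetical categorized scrams, for illustration only.
events = [
    ("scram-1", "Hardware", "Design deficiency"),
    ("scram-2", "People", "Training deficiency"),
    ("scram-3", "Hardware", "Maintenance deficiency"),
    ("scram-4", "Procedures", "Incorrect procedure information"),
]

for category, share in category_shares(events).items():
    print(f"{category}: {share}%")
```

Because each share is rounded separately, the shares may not add up to exactly 100.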

Abstract for: ROOT CAUSE CODE DETERMINATIONS letter from CL Larson of GE 
Reliability Engineering Services to BWROG Scram Frequency 
Reduction Systems Design Activity of February 1988 

This letter contains a revision to the cause coding tree and a document 
which explains each box of the tree. The changes to the tree were mostly 
cosmetic and also included the elimination of duplication. 

TAKE AWAY ITEMS 

o SFRC cause coding tree has four major categories: 

- Equipment Malfunction 

- Human Miscue 

- Procedural Deficiency 

- Other 

o The equipment malfunction category has three subcategories: 

- Design 

- Maintenance 

- Fabrication did not meet design 

o The human miscue category has four subcategories: 

- Operations 

- Technical Support 

- Maintenance 

- Installation/Modification 

o The procedural deficiency category has four subcategories: 

- Operations 

- Technical Support 

- Maintenance 

- Installation/Modification 

o The other category has only two subcategories: 

- Acts of nature 

- Other 
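
The two levels of the cause coding tree listed above can be held in a small lookup table so that a proposed cause code can be checked before it is recorded. This is a sketch only. The table holds just the categories and subcategories named in this abstract, not the individual boxes of the full tree, and the names `SFRC_CAUSE_TREE` and `validate_code` are hypothetical.

```python
# Categories and subcategories as listed in this abstract.
_SUPPORT_GROUPS = ["Operations", "Technical Support", "Maintenance",
                   "Installation/Modification"]

SFRC_CAUSE_TREE = {
    "Equipment Malfunction": ["Design", "Maintenance",
                              "Fabrication did not meet design"],
    "Human Miscue": list(_SUPPORT_GROUPS),
    "Procedural Deficiency": list(_SUPPORT_GROUPS),
    "Other": ["Acts of nature", "Other"],
}


def validate_code(category, subcategory):
    """Return 'category / subcategory' if the pair is on the tree.

    Raises ValueError for a category or subcategory that is not listed.
    """
    if category not in SFRC_CAUSE_TREE:
        raise ValueError(f"unknown category: {category!r}")
    if subcategory not in SFRC_CAUSE_TREE[category]:
        raise ValueError(f"{subcategory!r} is not a subcategory of {category!r}")
    return f"{category} / {subcategory}"


print(validate_code("Human Miscue", "Operations"))
```

Rejecting codes that are not on the tree keeps the database entries consistent, so that similar events can be counted and compared.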